This is the final report for Contract number
the User Systems Engineering Organization at Texas
Instruments in Dallas, Texas. The experiment was
conducted and the data were analyzed by Virginia
Polytechnic Institute and State University human
factors engineering personnel under Contract number
N66001-85-C-0254.
technical contract monitor, for his support and
to Drs. Gerald Birdwell, Harry Snyder, William
Muto, Brian Epps, Ann Hammer and Daniel Donohoo for

EXECUTIVE SUMMARY


The use of natural language human-computer
dialogue has been a subject of much discussion in
recent years. Menu-based natural language (MBNL)
provides a form of constrained natural language
dialogue for human-computer interaction where
natural language words and phrases are displayed on
the screen as menu items.

Previous research on cursor devices has provided
mixed results concerning the "best" cursor device
and no firm recommendations were available for use
with menu-based natural language interfaces.

This study was developed to determine the best
input device for MBNL interfaces to Naval command
and control databases. Three different cursor
control and input devices (trackball, keyboard
cursor keys, and search keys) were evaluated for
use in MBNL interfaces. Another goal of the study
was to investigate the effects of scrolling and
query length on user performance.

Eighteen Operation Specialists from the Naval
Ocean Systems Center performed typical database
query tasks using MBNL. A within-subjects design
was used to evaluate their performances and
preferences while using each of three input
devices, for three query lengths, with scrolling and
no scrolling, over six trial periods.

The dependent measures included total task time,
error frequency, ratings of the input devices, and
rankings of the input devices. The subjects were
given the exact queries they were to enter since
measuring query formulation time was not a goal of
this study.

All  main  effects  were  found  to  be  statistically 
significant.  The  performance  times  for  the  three 
input  devices  showed  the  search  keys  to  be  slower 
than  the  cursor  keys  and  the  trackball.  The 
performance  time  for  the  no-scrolling  condition  was 
significantly  faster  than  for  the  scrolling 
condition.  There  was  also  a  significant  difference 
in  performance  time  due  to  query  length. 

The  trackball  was  the  most  preferred  device, 
while  the  cursor  keys  were  least  preferred. 
Overall,  query  construction  times  were  faster  with 
the  cursor  keys  and  the  trackball  than  with  the 
search  keys.  Performance  with  the  trackball  was 
apparently  asymptotic  by  the  end  of  the  first  trial 
block,  and  consequently  the  trackball  had  an  early 
advantage over the cursor keys.

The results suggest that users perform equally
well with the trackball and the cursor keys and
that a beginning, intermittent, or infrequent
system user could more quickly "get up to speed"
with the trackball than he could with the cursor
keys.

INTRODUCTION


Natural language human-computer dialogue has been
a subject of much discussion in recent years.
Proponents of natural language human-computer
dialogue claim that it has several advantages over
formal command language dialogue in that natural
language dialogue is versatile, is easy to use,
does not require much up-front training, and
permits the possible use of speech recognizers for
input. Furthermore, users do not have to learn a
command syntax or new syntactical rules, thereby
accommodating the inexperienced user. Shneiderman
(1987) argues that natural language human-computer
dialogue "...can be effective for the user who is
knowledgeable about some task domain and computer
concepts but who is an intermittent user who cannot
retain the syntactic details" (p. 166).

Several applications of restricted scope, such as
LUNAR, SOPHIE, ELIZA, CHECKBOOK, BASEBALL, MARGIE,
and INTELLECT, have demonstrated that it is
possible to design computer programs that will
accept natural language instructions to accomplish
particular tasks (Bobrow & Collins, 1975; Brown,
Burton, & Bell, 1975; Ford, 1981; Green, Wolf,
Chomsky, & Laughery, 1963; Patrick, 1976; Schank,
1975; Schank & Colby, 1973; Suding, 1983;
Weizenbaum, 1966; Woods, 1970). Experimental
studies of natural language dialogue have included
comparisons between natural languages and query
languages, laboratory studies of prototype natural
query languages, and field studies of prototype
systems (see Damerau, 1981; Egly & Wescourt, 1981;
Hershman, Kelly, & Miller, 1979; Kaplan, 1982;
Krause, 1979; Miller, Hershman, & Kelly, 1978;
Shneiderman, 1978; Small & Weldon, 1983; Tennant,
1980; Waltz, 1977). Encouraging results have been
reported, but most of the studies also indicate
usability problems.

A number of disadvantages and shortcomings of
natural language dialogue have been described (see
Biermann, Ballard, & Sigmon, 1983; Hauptmann &
Green, 1983; Lowden & DeRoeck, 1985; Ogden &
Brooks, 1983; Shneiderman, 1980, 1987; Tennant,
Ross, & Thompson, 1983; Weizenbaum, 1966, 1976;
Winograd, 1972). Relatively high failure rates,
high error rates, ease of use problems, and user
frustration have been noted. Some have argued that
natural language dialogue leads to ambiguity in the
formulation of queries and requests and that
natural languages are not only ambiguous but overly
verbose. Natural language systems are noted to be
mysterious about their coverage and capabilities,
and natural language dialogue, it has been argued,
leads to an overestimation of computer capabilities
and intimidation of the user. Features of natural
language systems are thus often not used because
users are unaware of them or do not trust them.

The  desirability  of  natural  language  systems  for 
use  across  the  user  spectrum  and  user-system  task 
variety  has  been  questioned.  Despite  the  user 
interface  problems,  natural  language  dialogue  is 
generally  considered  preferable  for  inexperienced 
users.  For  knowledgeable  and  frequent  users  who 
are  thoroughly  aware  of  available  functionality, 
however,  a  concise  command  language  seems 
preferable. Experts, it has been noted,
generally prefer terse, formal command languages.

From  the  software  development  perspective,  there 
are  also  reservations  about  natural  language 
systems.  The  programs  must  handle  relatively  large 
grammars  and  lexicons,  and  the  code  required  to 
parse  and  translate  the  natural  language  input  can 
be  extensive  and  complex.  The  programs  typically 
require  "best  guess"  algorithms  to  handle  spelling, 
syntactic,  and  semantic  variations.  System 
resources  must  consequently  be  allocated  for 
recognizing  the  variant  syntactical  structures  and 
synonymous  terms.  Additionally,  resources  must  be 
allocated for error checking and clarification
procedures.  Conventional  natural  language  systems 
are  thus  expensive  to  build  and  maintain,  and  they 
require  large  amounts  of  computer  memory. 

Menu-Based Natural Language

As an alternate form of human-computer
interaction, menu-based natural language (MBNL)
stands at the middle ground between the restrictive
formal command languages and unconstrained
free-form natural language. MBNL provides a form of
constrained natural language dialogue for
human-computer interaction. With an MBNL interface,
natural language words and phrases are displayed on
a screen as menu items. The user constructs a
natural language sentence by selecting the menu
items with a pointing device. As the menu items
are selected, the natural language sentence is
formed in a window, and when the command sentence
is complete, the sentence is sent to the underlying
application program for execution.
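
The selection process described above can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: the
transition table NEXT_WINDOWS and the abbreviated MENUS are
hypothetical and are not taken from any actual MBNL grammar. The point
it shows is that a sentence assembled only from active menus cannot be
malformed.

```python
# Hypothetical toy grammar: each window lists the windows that may follow it.
# Window names echo this report; the transitions themselves are assumed.
NEXT_WINDOWS = {
    "START": ["VERB"],
    "VERB": ["SETTING", "UNITS"],
    "SETTING": ["UNITS"],
    "UNITS": ["ATTRIBUTES", "END"],
    "ATTRIBUTES": ["END"],
}

# Abbreviated, illustrative menus (not the full menus used in the study).
MENUS = {
    "VERB": ["Count", "Display", "List"],
    "SETTING": ["Dot blinking", "Symbol normal"],
    "UNITS": ["Soviet air", "US air"],
    "ATTRIBUTES": ["Within 50 nautical miles"],
}


def build_sentence(selections):
    """Assemble a sentence from (window, item) picks, rejecting illegal picks."""
    state = "START"
    words = []
    for window, item in selections:
        if window not in NEXT_WINDOWS[state]:
            raise ValueError(f"{window} is not active after {state}")
        if item not in MENUS[window]:
            raise ValueError(f"{item!r} is not on the {window} menu")
        words.append(item)
        state = window
    if "END" not in NEXT_WINDOWS[state]:
        raise ValueError("sentence is incomplete")
    return " ".join(words)


sentence = build_sentence([("VERB", "Count"), ("UNITS", "Soviet air")])
print(sentence)  # prints "Count Soviet air"
```

Because every pick is checked against the currently active menus, no
free typing is involved and an out-of-grammar sentence is rejected
before it could reach the application.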

Work in the area of MBNL dialogue has shown
promising results (Osga, 1984; Tennant, Ross,
Saenz, Thompson, & Miller, 1983; Tennant, Ross, &
Thompson, 1983; Thompson et al., 1983). MBNL
interfaces can be developed relatively quickly and
require fewer memory resources than a conventional
natural language system. The coverage and
limitations of an MBNL system are made more
apparent to the user due to the use of a restricted
natural language. The user can thus avoid the
frustration of overextending beyond the limits of
system functionality. Since MBNL interfaces are
closed and manageable, they also allegedly
encourage exploration and use of the full range of
system resources. Furthermore, MBNL interaction
requires only the use of a pointing device, such as
a mouse, trackball, or lightpen. If a keyboard is
used for input, only the cursor keys and enter key
are required. Typing is thus eliminated and the
user is guaranteed a semantically and syntactically
correct query or command input.


Cursor Control and Input Devices

Cursor devices may be indirect, such as the
cursor key, joystick, trackball, or touchpad, or
direct, such as the lightpen or touch screen
(Ohlson, 1978). The few studies that report
experimental comparisons of two or more cursor
devices generally indicate that the direct devices,
such as the lightpen and touch screen, perform the
best of all devices in terms of task completion
time. The trackball, however, appears to be the
best device in terms of accuracy. Overall, it
appears that the mouse and the trackball are the
best devices across a variety of tasks (Epps,
1986).

English, Engelbart, and Berman (1967) performed
the first notable study aimed at the comparison of
cursor devices. The devices included a mechanical
mouse, a displacement joystick (absolute and rate
modes), a lightpen, a graphacon, and a knee-control
device. For experienced subjects, the mouse had
the fastest time to target and lowest error rate
for both character and word targets. For
inexperienced subjects, the knee-control was the
best device for time to target, while the mouse had
the lowest error rate.

Mehr and Mehr (1972) compared several joystick
and trackball configurations for a simple target
acquisition task. The joystick was studied under
four configurations, including force (rate mode),
force (rate mode with thumb operation),
displacement (rate mode), and displacement
(absolute mode). The trackball was also studied
under four conditions with pulses per trackball
revolution/grams of drag force ratios of 209/50,
209/35, 409/57, and 409/35. The results showed
that the 409/35 trackball configuration and the
force (rate mode) joystick were the best devices on
time to position, error, and learning curves.

Goodwin (1975) compared a lightpen and a lightgun
to keyboard text keys for three simulated word
processing tasks, including arbitrary cursor
positioning, sequential cursor positioning, and
check reading. The results showed that the two
lightpen devices were significantly faster than
keyboard text keys for trial completion time.

Card, English, and Burr (1978) performed an
experimental comparison of a mechanical mouse, a
force joystick (rate mode), cursor keys, and text
keys for a simulated word processing task. Target
size and target distance were also manipulated.
There were four target sizes of 1, 2, 4, and 10
characters, and five target distances of 1, 2, 4,
8, and 16 cm. The results showed that the time to
target was significantly faster for the mouse and
joystick than for the step keys and text keys
across all target sizes and distances. Across
target size, the mouse had the lowest error rate of
the four devices.

Gomez, Wolfe, Davenport, and Colder (1982)
compared a trackball and touchpad (absolute mode)
in a study performed at the Naval Ocean Systems
Center (NOSC). Half the subjects were trained on
the system with a trackball and half had no
experience on either device. No difference was
found between the touchpad and trackball for time
to target. The error (distance from target center)
was significantly lower for the trackball across
both groups. Additionally, the trained subjects
had a significantly lower error rate across both
devices.

Albert  (1982)  performed  a  comprehensive 
comparison  of  devices  on  a  simple  target 
acquisition  task.  The  devices  included  a  touch 
screen,  a  lightpen,  a  touchpad  (with  "puck"),  a 
trackball,  a  displacement  joystick  (rate  mode),  a 
force  joystick  (rate  mode),  and  cursor  keys. 
Although  there  were  significant  differences  among 
the  devices,  no  post-hoc  test  results  were 
reported.  Based  on  means  for  positioning  speed, 
the  order  from  best  to  worst  cursor  device  was 
touchscreen,  lightpen,  touchpad,  trackball,  force 
joystick,  displacement  joystick,  and  cursor  keys. 
For  positioning  accuracy,  the  order  was  trackball, 
touchpad,  force  joystick,  displacement  joystick, 
and  cursor  keys.  Subjective  ratings  were  also 
collected,  but  no  statistical  analyses  were 
performed  on  the  data.  However,  an  inspection  of 
the  mean  ratings  indicates  that  the  touchscreen, 
lightpen,  and  touchpad  were  considered  the  most 
comfortable, easiest to learn, and least tiring to
use.

Following the development of a touchpad for use
in the Royal Signals and Radar Establishment (RSRE)
for air traffic control tasks, Whitfield, Ball, and
Bird (1983) compared the RSRE touchpad with a
trackball and a touchscreen. Only the findings
from one of three reported experiments are
described here. The factors of interest were
device type and target size. Target size was
varied from 1.5 to 12 cm in 1.5-cm increments.
Statistical test results for time to target
indicated significant differences among devices.
Although no post-hoc test results were given, the
authors reported that the touchscreen was ranked
the fastest and the trackball the slowest. Again,
though no post-hoc test results were reported, the
trackball had the lowest percentage of errors and
the touchscreen had the highest.

Struckman-Johnson, Swierenga, and Shieh (1984)
compared a displacement joystick (absolute mode), a
trackball, a lightpen, and non-repeating keyboard
keys on a simulated text editing task. Gender was
also included as a factor in the study. For males,
the lightpen and trackball yielded faster trial
completion times than either the joystick or
keyboard keys. For females, the lightpen yielded
faster trial completion times than all other
devices. Males performed better than females when
using the cursor keys, joystick, and trackball, but
not the lightpen. The keyboard keys and trackball
resulted in lower error rates than the joystick
across all subjects. Furthermore, the lightpen and
trackball were preferred over the joystick and
cursor keys.

Haller,  Mutschler,  and  Voss  (1984)  compared  a 
lightpen, touchpad (absolute), mouse, trackball,
repeating  cursor  keys,  and  a  speech  recognition 
device  on  a  simulated  word  processing  task. 
Subjects  were  allowed  to  choose  their  own  preferred 
control/display  gain  for  the  touchpad,  trackball, 
and  mouse.  The  lightpen  was  found  to  be  superior 
to  all  other  devices  and  voice  input  was  found  to 
be  inferior  to  all  other  devices  on  time  to  target. 
In  addition,  the  lightpen  and  cursor  keys  showed 
the  smallest  error  rate.  Of  all  devices  in  the 
study,  the  lightpen  was  the  most  preferred. 

Karat, McDonald, and Anderson (1984) compared a
touchscreen, an optical mouse, and keyboard keys.
Subjects performed a menu-type target acquisition
task embedded within two applications, including a
computer-based telephone aid and an appointment
aid. For target acquisition, the touchscreen was
superior for speed and the keyboard was superior
for accuracy. For the applications tasks, the
touchscreen was superior to the mouse and the
keyboard for menu selection. Subjects preferred
the touchscreen and keyboard over the mouse for
performance of the applications menu selection
tasks.

Epps (1986) compared the performances of an
absolute touchpad, a relative touchpad, a mouse, a
trackball, a force joystick, and a displacement
joystick. Prior to comparison, the devices were
optimized for display/control dynamics in
independent experiments. The devices were then
tested on three types of tasks: target acquisition,
text editing, and graphics. Epps found a wide
variation in the cursor positioning performance of
the devices on the three types of tasks. In
general, the two joysticks performed worse on the
target acquisition and graphics tasks than the two
touchpads. On the text editing task, however, the
rate-controlled joysticks performed better than the
touchpads. The mouse and the trackball performed
the best, without exception, across all tasks.
Additionally, these devices were the most
preferred.

Research on cursor devices has provided mixed
results concerning the "best" cursor device. There
is general agreement that touch entry devices
(e.g., touch screens, lightpens) are best when fast
acquisition of relatively large targets is
required. In other words, touch entry devices are
typically fast but inaccurate. There is a lack of
agreement on the most accurate device, but the
mouse or trackball appears to be the recommended
device. The research of Epps (1986) indicates that
the mouse and trackball are the overall "best"
devices for a variety of task environments.

Nevertheless, no firm conclusions can be drawn
that would warrant generalizations to menu-based
natural language (MBNL) interfaces, particularly
MBNL interfaces to Naval command and control
databases. Furthermore, the shipboard environment
adds its own unique set of requirements. For
example, a physically stable cursor device is
required. Thus, a mouse can be ruled out as an
alternative since it will tend to slide around
under unstable conditions. Consequently, there was
a need to determine the appropriate cursor device
to meet the unique requirements of the NOSC MBNL
interface.

This study was conducted to compare three
different cursor control and input devices, namely,
a trackball, keyboard cursor keys, and search keys.
Search keys move the cursor to the first menu item
that begins with the keyed letter. For example, if
a menu contains the items "Count", "Display" and
"List", and the user types an "L", the cursor will
jump to the menu item "List".
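
The search-key behavior just described can be illustrated with a
short sketch. The function name search_key_jump is hypothetical, and
two details are assumptions because the text does not specify them:
matching ignores letter case, and the cursor stays put when no item
matches.

```python
def search_key_jump(menu_items, keyed_letter, current_index=0):
    """Return the index of the first menu item that starts with keyed_letter.

    Assumptions (not stated in the report): matching is case-insensitive,
    and the cursor does not move if no item matches.
    """
    letter = keyed_letter.lower()
    for index, item in enumerate(menu_items):
        if item.lower().startswith(letter):
            return index
    return current_index


menu = ["Count", "Display", "List"]
cursor = search_key_jump(menu, "L")
print(menu[cursor])  # prints "List"
```

When several items share an initial letter, this rule lands on the
first of them, so the user would still need the cursor keys to reach
the later ones.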

Additionally, the effects of scrolling and query
length were investigated. The effect of scrolling
was investigated with a scrolling versus
no-scrolling manipulation. The effect of query length
was evaluated by requiring subjects to select two,
three, or four menu items to construct queries.


METHOD


There were 18 Operation Specialists from the
Naval Ocean Systems Center who participated in the
experiment. The subjects were male and ranged in
age from 24 to 43. All subjects had experience on
a microcomputer and 15 to 18 months of tactical
console experience.


Query instructions. There were six sets of query
instructions developed for the experiment, with 36
query instructions within each set. Each
instruction set was produced by a factorial
combination of three query lengths (2, 3, or 4 menu
items) and two menu lengths (scrolling/no-scrolling).
The query instructions were worded
without syntactic or semantic variation from the
actual menu items (Table 1, Figures 1 and 2). Full
listings of the queries and menu items are
available in the appendices.

Subjective evaluations. Subjects rated the input
devices on five bipolar scales. The scale anchor
points included accurate-inaccurate, fast-slow,
consistent-inconsistent, comfortable-uncomfortable,
and acceptable-unacceptable (see Appendix C).

Subjects rank-ordered the input devices on four
dimensions. The ranking dimensions included most
preferred-least preferred, fastest selection
speed-slowest selection speed, highest accuracy-lowest
accuracy, and most comfortable-least comfortable
(see Appendix D).

TABLE 1

Example Queries for Each Combination of Query
Length and Scrolling

Query Length 2
   No-scrolling:
      Count Soviet air
   Scrolling:
      List downed aircraft

Query Length 3
   No-scrolling:
      List EA2B reported by U.S. Ticonderoga
   Scrolling:
      Count downed aircraft within 50 nautical miles

Query Length 4
   No-scrolling:
      Display dot blinking U.K. air controlling jammer
      mission whose location is hook location
   Scrolling:
      Display symbol normal special point with remote
      data source

Software. The menu-based natural language
interface was developed using NaturalLink™ (Texas
Instruments, 1985a, 1985b). NaturalLink™ combines
an interactive menu-based system with a semantic
grammar analysis approach to natural language
processing (where sentences are parsed according to
semantic rather than syntactic categories).


VERB
   Count
   Display
   List

ENVIRONMENT

SETTING
   Blink
   Bright1
   Bright2
   Dot Blinking
   Dot Normal
   Inverse
   Size1
   Size2
   Symbol Normal

UNITS
   Air
   Soviet air
   UK air
   US air
   EA2B
   E2
   F-14
   S3A
   Platform

ATTRIBUTES
   Controlling friend air whose location is quadrant 4
   Controlling jammer mission whose location is hook location
   On barcap
   Received by Link 11
   Received by US Ticonderoga
   Reporting hostile air whose location is quadrant 4

Figure 1. Menu Items Visible on the Work Screen
Without Scrolling.

VERB
   Count
   Display
   List

ENVIRONMENT

SETTING
   Dot blinking
   Dot normal
   Inverse
   Size1
   Size2
   Symbol blinking
   Symbol bright1
   Symbol bright2
   Symbol normal

UNITS
   Soviet surface
   US surface
   CV-64
   Ticonderoga
   Track
   Hooked track
   Track 7526
   PU number 24
   Unknown

ATTRIBUTES
   Whose location is quadrant 1 and controlled by CV-67
   Whose location is quadrant 1 and reported by FF-1023
   Whose mission is AEW
   Within 50 nautical miles
   With remote data source
   With Tomahawk missiles

Figure 2. Menu Items Visible on the Work Screen by
Scrolling to the Bottom of the Windows.

The interaction between the user and application
software is handled by a window manager, a parser,
a translator, and a sessioner (driver). The window
manager runtime controls the screen displays and
returns inputs from the windows when menu items are
selected. The parser receives the inputs from menu
selections, consults grammar and lexicon files, and
builds a parse tree. The parse tree is then passed
to the translator when the user completes and
enters the query. The translator receives the
parse tree, maps it to the elements of the
underlying application program, and passes it to
the sessioner. As the user builds and executes
the queries, the sessioner coordinates the
interaction among the parser, translator, and
window manager, passing control among these
software components and the application. The
application finally calls the window manager to
display the results of the query.

Calls  to  the  NaturalLink™  runtime  software  were 
made  by  a  program  written  in  Microsoft  FORTRAN 
version  4.0.  Additionally,  the  FORTRAN  routine 
received  key  codes  returned  from  the  window  manager 
and  performed  DOS  time  calls  on  each  return.  The 
time-stamped  key  codes  were  written  to  a  buffer  and 
were subsequently written to disk whenever the user
executed  a  query. 
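
The buffering scheme described above can be sketched as follows. This
is a Python illustration rather than the FORTRAN routine used in the
study: the class name KeyLog, the tab-separated file layout, and the
use of time.monotonic in place of DOS time calls are all assumptions.
Treating F10 as the execute code follows the keyboard assignment
given in this report.

```python
import os
import tempfile
import time


class KeyLog:
    """Buffer time-stamped key codes; write them to disk on query execution."""

    def __init__(self, path, execute_code="F10", clock=time.monotonic):
        self.path = path
        self.execute_code = execute_code
        self.clock = clock   # stand-in for the DOS time call (assumption)
        self.buffer = []

    def record(self, key_code):
        """Stamp one returned key code; flush if it executes the query."""
        self.buffer.append((self.clock(), key_code))
        if key_code == self.execute_code:
            self.flush()

    def flush(self):
        """Append the buffered records to the log file and clear the buffer."""
        with open(self.path, "a", encoding="utf-8") as handle:
            for stamp, code in self.buffer:
                handle.write(f"{stamp:.3f}\t{code}\n")
        self.buffer.clear()


# Demo with a fake clock so the time stamps are predictable.
ticks = iter([0.0, 1.5, 2.25])
demo_path = os.path.join(tempfile.mkdtemp(), "keys.log")
log = KeyLog(demo_path, clock=lambda: next(ticks))
for code in ["DOWN", "RETURN", "F10"]:
    log.record(code)
with open(demo_path, encoding="utf-8") as handle:
    print(handle.read())
```

Deferring the disk write until the query is executed keeps file
access out of the timed interval between keystrokes.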

Equipment

Computer system. The MBNL software and keystroke
capturing software were run on an NCR PCS computer
with 8 MHz clock speed, 640K memory, a 20MB hard
disk, an EGA graphics board, and a monochrome NEC
Multisync monitor. The "return" key was used to
"select" the particular menu item highlighted by
the cursor. The F8 key was designated as the "back
up" key, used to back up, erase a previously
selected menu item from the query, and return the
cursor to the menu from which the item had been
selected. The F10 key was used to "execute" the
completed query.


Trackball. A Measurement Systems Model 621
trackball (4-cm diameter) was used in the present
study. The trackball was set to operate at a 0.8
display/control gain (10 cm of cursor movement per
360 deg of trackball revolution, with a trackball
circumference of 12.5 cm). The gain selected for
the trackball had been found by Epps (1986) to be
best for text editing tasks. Also, in the pilot
testing phase of the present experiment, four
subjects were asked to use the trackball with the
gain varied over a wide range. The median
preferred gain among the four pilot subjects was
0.8, corroborating the desirability of the selected
trackball gain.
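
The gain figure can be checked with simple arithmetic. The sketch
below assumes that display/control gain is defined as cursor travel on
the display divided by travel at the surface of the ball, which is
consistent with the numbers quoted; the function name is
hypothetical.

```python
import math


def display_control_gain(cursor_travel_cm, control_travel_cm):
    """Gain = cursor movement on the display / movement at the control."""
    return cursor_travel_cm / control_travel_cm


# Values quoted in the text: 10 cm of cursor travel per full revolution,
# with a ball circumference of 12.5 cm.
gain = display_control_gain(10.0, 12.5)
print(round(gain, 2))  # prints 0.8

# A 4-cm diameter ball has a circumference of pi * 4, about 12.57 cm,
# which agrees with the rounded 12.5 cm figure.
circumference = math.pi * 4.0
print(round(circumference, 2))  # prints 12.57
```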

A three-button custom keypad was used with the
trackball. The left button was designated as the
"return" key, the middle button was designated as
the "execute" key, and the right button was
designated as the "back up" key.

The experiment was conducted as a 3 X 2 X 3 X 6
within-subjects design. The first factor was input
device type with three levels representing the
three devices, namely, the trackball, the keyboard
cursor keys, and the search keys along with the
cursor keys. The second factor was menu length with
two levels representing scrolling and no-scrolling.
The third factor was query length with three levels
representing queries constructed from two, three,
and four menu items. Finally, the fourth factor
was replication, with six levels representing the
six replications or trial blocks. The six
conditions produced by the query length and
scrolling/no-scrolling factors were balanced across
input device and the six replications of each
condition were randomly assigned in six different
sequences. These six sequences comprised the six
sets of query instructions which were assigned at
random for three series of six subjects each. The
experimental design is depicted in Figure 3.

Figure 3. Experimental Design.
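
The cell counts implied by this design can be verified by enumerating
the factor levels. In the sketch below only the number of levels per
factor comes from the text; the level labels and variable names are
chosen for illustration.

```python
from itertools import product

DEVICES = ["trackball", "cursor keys", "search keys"]
MENU_LENGTHS = ["no-scrolling", "scrolling"]
QUERY_LENGTHS = [2, 3, 4]          # menu items per query
TRIAL_BLOCKS = range(1, 7)         # six replications

# Every combination of the four within-subjects factors.
cells = list(product(DEVICES, MENU_LENGTHS, QUERY_LENGTHS, TRIAL_BLOCKS))

# Trials one subject completes with a single device.
per_device = [cell for cell in cells if cell[0] == "trackball"]

print(len(cells))       # prints 108
print(len(per_device))  # prints 36
```

The 36 trials per device match the 36 query instructions contained in
each instruction set.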

The  dependent  measures  included  total  task  time, 
error  frequency,  ratings  of  the  input  devices,  and 
rankings  of  the  input  devices.  Total  task  time  was 
defined  in  terms  of  query  construction  time  as  the 
time  from  when  a  query  was  initiated  to  when  a 
query  was  executed. 

Procedure 

Subjects were first given general instructions to
read. These instructions included all of the
pertinent information about the experiment, with
the exception of information pertaining to the
input devices. Following the general instructions,
subjects were given the instructions for the first
input device, followed by four practice trials for
that device. Subjects were then given the
instructions for the second input device, and were
allowed four practice trials with that device, and
likewise with the third input device, for a total
of 12 practice trials.

The  notebook  containing  the  practice  queries  and 
the  queries  for  the  experimental  trials  was 
positioned  on  the  left  side  of  the  subjects' 
workstation. Subjects were instructed to turn the
page  in  the  query  instruction  notebook  and  become 
familiar  with  the  query  before  initiating  query 
construction.  The  purpose  of  this  instruction  was 
to  exclude  the  time  to  prepare  for  query 
construction  from  actual  query  construction  time. 

The first screen presented to subjects was an
initiation screen, which prompted the subject to
"Press Enter to Continue." Pressing "enter"
started the timer embedded in the calling program
and brought up the work screen with the cursor
positioned on the first menu item in the VERB
window (Figures 1 and 2). Construction of the query
required the selection of appropriate menu items,
which was accomplished by moving the cursor with
the input device to highlight the desired item and
then pressing "return." When a menu item was
selected, it was added to the query in the results
window at the top of the screen; then, depending on
the grammar rules in effect, the cursor
automatically moved to the first menu item in the
next appropriate window where the subject could
continue constructing the query. Once a query was
constructed, the subject could then execute the
query by pressing the "execute" key. Execution of
the query signaled the end of the trial, stopped
the timer in the calling program, and brought up
the initiation screen again.

During query construction, subjects could back up
to previous selections by pressing the "back up"
key which, as described earlier, would erase a
previously selected menu item from the query and
return the cursor to the menu from which the item
had been selected. This option allowed subjects to
correct any errors they may have noticed during
query construction. Once a query was executed, the
subjects could not back up. Error correction time
was included in the total query construction time.
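
The timing rules above (timer starts on "enter", stops on "execute", and time spent backing up counts toward the total) can be illustrated with a small Python sketch. The class and method names are hypothetical and do not come from the original calling program. The sketch also assumes that "back up" removes only the most recent selection.

```python
import time
from typing import Callable, List, Optional


class QueryTrial:
    """Hypothetical model of one timed query-construction trial."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._start: Optional[float] = None
        self.items: List[str] = []
        self.elapsed: Optional[float] = None

    def press_enter(self) -> None:
        """Leave the initiation screen and start the timer."""
        self._start = self._clock()
        self.items = []
        self.elapsed = None

    def select(self, item: str) -> None:
        """Add the highlighted menu item to the query."""
        self._require_active()
        self.items.append(item)

    def back_up(self) -> None:
        """Erase the most recent selection (assumed behavior)."""
        self._require_active()
        if self.items:
            self.items.pop()

    def execute(self) -> float:
        """Stop the timer; backing up is no longer possible afterwards."""
        self._require_active()
        self.elapsed = self._clock() - self._start
        return self.elapsed

    def _require_active(self) -> None:
        if self._start is None or self.elapsed is not None:
            raise RuntimeError("trial is not in progress")


# Usage with a fake clock so the example is deterministic.
ticks = iter([0.0, 7.5])
trial = QueryTrial(clock=lambda: next(ticks))
trial.press_enter()
trial.select("Count")
trial.select("surface")   # an error the subject notices
trial.back_up()           # correction time stays inside the timed interval
trial.select("air")
total = trial.execute()
print(trial.items, total)
```

Because the clock is read only at "enter" and "execute", any time spent correcting errors is automatically part of the measured construction time, as in the experiment.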

After the 36th trial with a particular device,
the subjects were instructed to stop, at which time
they completed the rating scale for that device.
After they completed the rating scale, a 10 to 15
minute rest break was allowed. After all 108 trials
were completed, the devices were rank-ordered.

Total Task Time

A four-way analysis of variance (ANOVA) was
conducted on total task time (Device by Menu Length
by Query Length by Replication). All main effects
were found to be significant (Table 2). Most
importantly, there was a significant difference
among the performance times of the three input
devices (p = 0.0073) (Figure 4). A Newman-Keuls
test showed that the search keys were slower (13.73
s) than the cursor keys (12.14 s, p < 0.01) and the
trackball (12.17 s, p < 0.01). However, there was
no reliable difference between the cursor keys and
the trackball.

The performance time for the no-scrolling
condition was significantly faster (9.81 s) than
for the scrolling condition (15.55 s, p < 0.0001)
(Figure 5). There was also a significant difference
in performance time due to query length (p <
0.0001) (Figure 6). A Newman-Keuls test showed
that two-item queries (6.43 s) were performed more
quickly than three-item (12.61 s, p < 0.01) and
four-item (19.00 s, p < 0.01) queries, and three-item
queries were performed more quickly than four-item
queries (p < 0.01).

TABLE 2


ANOVA Summary Table for Total Task Time

SOURCE                             MS     df            F        p
------------------------------------------------------------------
SUBJECTS                      2703.23     17
DEVICE                         533.07      2         5.70   0.0073
DEV x SUB                       93.49     34
SCROLLING                    16049.98      1        76.37   0.0001
SC x SUB                       210.16     17
QUERY LENGTH                 25610.71      2        51.23   0.0001
QL x SUB                       499.87     34
TRIAL BLOCK                    369.10      5         8.84   0.0001
TB x SUB                        41.74     85
DEV x SC                        33.82      2         0.62   0.5414
DEV x SC x SUB                  54.12     34
DEV x QL                        68.29      4         1.97   0.1089
DEV x QL x SUB                  34.67     68
DEV x TB                        36.43     10         0.91   0.5226
DEV x TB x SUB                  39.90    170
SC x QL                        818.94      2        15.14   0.0001
SC x QL x SUB                   54.10     34
SC x TB                         47.74      5         1.71   0.1409
SC x TB x SUB                   27.91     85
QL x TB                         77.30     10         2.95   0.0019
QL x TB x SUB                   26.22    170
DEV x SC x QL                   43.84      4         0.99   0.4181
DEV x SC x QL x SUB             44.20     68
DEV x SC x TB                   45.89     10         1.02   0.4269
DEV x SC x TB x SUB             44.90    170
DEV x QL x TB             [illegible]     20  [illegible]   0.0355
DEV x QL x TB x SUB             31.72    340
SC x QL x TB                    27.66     10         0.85   0.5825
SC x QL x TB x SUB              32.59    170
DEV x SC x QL x TB              43.45     20         1.12   0.3252
DEV x SC x QL x TB x SUB        38.75    340
------------------------------------------------------------------
TOTAL                                   1943

Figure 4. Main Effect of Input Device.

Figure 6. Main Effect of Query Length.

Performance time was further observed to differ
across trial blocks (p < 0.0001) (Figure 7). A
Newman-Keuls test showed that performance times on
trial blocks three (12.13 s), four (12.28 s), five
(11.77 s), and six (11.89 s) were significantly
faster than performance times on trial blocks one
(14.37 s, p < 0.01) and two (13.64 s, p < 0.01).
Subjects had apparently reached asymptotic
performance by the third trial block.
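
The F ratios reported in Table 2 can be checked against the mean squares listed there. In the layout of the table, each effect appears to be tested against its own interaction with subjects (the row printed directly beneath it). The Python sketch below applies that reading to a few legible rows; the function name is illustrative.

```python
# (effect, MS for the effect, MS for effect x SUB, reported F), from Table 2.
ROWS = [
    ("SCROLLING", 16049.98, 210.16, 76.37),
    ("QUERY LENGTH", 25610.71, 499.87, 51.23),
    ("TRIAL BLOCK", 369.10, 41.74, 8.84),
    ("SC x QL", 818.94, 54.10, 15.14),
    ("QL x TB", 77.30, 26.22, 2.95),
]


def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F statistic as the ratio of effect to error mean square."""
    return ms_effect / ms_error


for name, ms_effect, ms_error, reported in ROWS:
    computed = f_ratio(ms_effect, ms_error)
    print(f"{name:<13} F = {computed:6.2f} (reported {reported:.2f})")
```

For these rows the recomputed ratios match the printed F values to two decimal places, which supports the reading of the table structure assumed here.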

Figure 7. Main Effect of Trial Block.

There was a significant interaction between trial
block and query length (p = 0.0019) (Figure 8). For
three- and four-item queries, performance times
differed across trial blocks (Table 3). A Newman-Keuls
test showed that three-item queries were
significantly faster on trial blocks five and six
than they were on trial blocks one through four.
Further, four-item queries were observed to be
significantly faster on trial blocks three through
six than they were on trial blocks one and two
(Table 4). In general, then, for three- and four-item
queries, there was a measurable decrease in
task performance time with practice, while no
improvement was observed for queries requiring only
two menu selections.


Figure 8. Trial Block by Query Length Interaction
(horizontal axis: trial block; one curve each for
query lengths 2, 3, and 4).


TABLE 3

Simple-Effect F-Tests on Trial Blocks for Each
Query Length

Query Length      MSTB        F         p
-----------------------------------------
Two              33.87     1.29    > 0.05
Three           114.90     4.38    < 0.01
Four            374.92    14.30    < 0.01

TABLE  4 


TABLE 6

Simple-Effect F-Tests on Query Length for Each
Scroll Level

Scroll        MSQL         F         p
--------------------------------------
Yes       17695.30    327.06    < 0.01
No         8734.35    161.44    < 0.01
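
The report does not state which error term was used for the simple-effect tests. As an inference from the printed numbers only, the F values in Tables 3 and 6 are reproduced when each simple-effect mean square is divided by the corresponding interaction-by-subjects mean square from Table 2. The Python sketch below checks this assumption.

```python
# Error mean squares transcribed from Table 2 (assumed error terms).
MS_QL_TB_SUB = 26.22  # QL x TB x SUB, assumed for Table 3
MS_SC_QL_SUB = 54.10  # SC x QL x SUB, assumed for Table 6

# (label, simple-effect mean square, reported F)
TABLE_3 = [("QL two", 33.87, 1.29), ("QL three", 114.90, 4.38), ("QL four", 374.92, 14.30)]
TABLE_6 = [("scrolling", 17695.30, 327.06), ("no scrolling", 8734.35, 161.44)]


def simple_effect_f(ms_effect: float, ms_error: float) -> float:
    """Simple-effect F using a pooled error mean square."""
    return ms_effect / ms_error


results = []
for rows, ms_error in ((TABLE_3, MS_QL_TB_SUB), (TABLE_6, MS_SC_QL_SUB)):
    for label, ms_effect, reported in rows:
        computed = simple_effect_f(ms_effect, ms_error)
        results.append((label, computed, reported))
        print(f"{label:<13} F = {computed:7.2f} (reported {reported:.2f})")
```

The small residual differences are of the size expected from rounding the mean squares to two decimals.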

TABLE 7

Newman-Keuls Tests on Query Length for Scroll Level

         Scrolling                  No-scrolling
Query Length     Mean         Query Length     Mean
-----------------------------------------------------
     4          22.85 (A)          4          15.15 (A)
     3          15.74 (B)          3           9.48 (B)
     2           8.07 (C)          2           4.78 (C)

NOTE: Means for either scrolling level sharing a
common letter in parentheses are not significantly
different (p > 0.01).
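
Because the design is balanced, the main-effect means quoted in the text should be recoverable as unweighted averages of the cell means in Table 7. The Python sketch below performs that arithmetic; the dictionary names are illustrative.

```python
# Cell means in seconds from Table 7, keyed by query length.
SCROLLING = {2: 8.07, 3: 15.74, 4: 22.85}
NO_SCROLLING = {2: 4.78, 3: 9.48, 4: 15.15}


def mean(values):
    """Unweighted arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


scroll_mean = mean(SCROLLING.values())
no_scroll_mean = mean(NO_SCROLLING.values())
query_length_means = {
    ql: mean([SCROLLING[ql], NO_SCROLLING[ql]]) for ql in sorted(SCROLLING)
}

print(f"scrolling {scroll_mean:.2f} s, no scrolling {no_scroll_mean:.2f} s")
for ql, value in query_length_means.items():
    print(f"query length {ql}: {value:.2f} s")
```

The averages agree with the reported main-effect means (15.55 s and 9.81 s for scrolling; 6.43 s, 12.61 s, and 19.00 s for query length) to within about 0.01 s, the difference being attributable to rounding of the cell means.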

scroll conditions (Table 6). A Newman-Keuls test
showed that within each scrolling condition,
performance times increased with increasing query
length (p < 0.01) (Table 7).

Finally, there was a significant three-way
interaction between input device, query length, and
trial block (p = 0.0355). For each device, a two-item
query was faster than a three-item query,
which, in turn, was faster than a four-item query
(Figures 10-12). There were significant
differences in task performance times across trial
blocks for three combinations of query length and
input device (Table 8). For four-item queries with
the cursor keys, there was a general decrease in
task performance time over the trial blocks (Figure
10). There were no significant changes in
performance time over the trial blocks for any
query length with the trackball (Figure 11). For
the search keys, there was an improvement in task
performance time over trial blocks for four-item
queries (Table 8, Figure 12). A Newman-Keuls test
showed that, for four-item queries with the search
keys, performance time was significantly slower in
the first trial block than in any other trial block
(Table 9). There was no significant change in
performance time across trial blocks for three-item
queries with the search keys.

Figure 10. Trial Block by Query Length
Interaction, Cursor Keys.

Figure 11. Trial Block by Query Length
Interaction, Trackball.

Errors were arranged by frequency of occurrence
for each of the 18 cells. The data met the
requirements for a Chi-Square test; however, the
total Chi-Square was not significant (Chi-Square =
17.86, df = 17, p > 0.05). Thus, there were no
differences in error frequency attributable to any
of the experimental factors of interest.

For the input device rating scales, a one-way
ANOVA was performed on each of the scale
dimensions. There were no reliable differences
found.

A Friedman One-Way Analysis of Variance by Ranks
was performed on each dimension of the device
rank-order measure. There was a significant difference
in device preference (p < 0.02). The trackball was
the most preferred device, while the cursor keys
were least preferred. Finally, a rank-order
difference was found for the speed dimension, with
the trackball ranked as the fastest device and the
cursor keys as the slowest (p < 0.001) (Table 10).
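
For readers who wish to check the rank-order analysis, the Friedman statistic can be computed directly from the rank sums reported in Table 10. The Python sketch below uses the standard formula without a correction for ties and assumes 18 subjects each ranking three devices (1 = best). The function name is illustrative.

```python
def friedman_chi_square(rank_sums, n_subjects):
    """Friedman chi-square from per-device rank sums (no tie correction)."""
    k = len(rank_sums)
    sum_of_squares = sum(r * r for r in rank_sums)
    return (12.0 * sum_of_squares / (n_subjects * k * (k + 1))
            - 3.0 * n_subjects * (k + 1))


# Rank sums in the order cursor keys, trackball, search keys (Table 10).
RANK_SUMS = {
    "Preference": (45, 27, 36),
    "Accuracy": (35, 38, 35),
    "Comfort": (41, 30, 37),
}

chi_squares = {
    dimension: friedman_chi_square(sums, n_subjects=18)
    for dimension, sums in RANK_SUMS.items()
}
for dimension, value in chi_squares.items():
    print(f"{dimension:<10} chi-square = {value:.2f}")
```

With two degrees of freedom, the 0.05 critical value of chi-square is 5.99, so of these three dimensions only Preference is significant, consistent with the text.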

TABLE 8

Simple-Effect F-Tests on Trial Blocks for All
Combinations of Device and Query Length

Device         Query Length         MSTB       F         p
----------------------------------------------------------
Cursor Keys         2              17.37    0.55    > 0.05
Cursor Keys         3              63.53    2.00    > 0.05
Cursor Keys         4        [illegible]    6.45    < 0.01
Trackball           2               3.80    0.12    > 0.05
Trackball           3              38.62    1.22    > 0.05
Trackball           4              65.70    2.07    > 0.05
Search Keys         2              57.18    1.80    > 0.05
Search Keys         3              78.12    2.46    < 0.05
Search Keys         4             280.31    8.84    < 0.01

TABLE 9

Newman-Keuls Tests on Trial Blocks for Query
Lengths Three and Four, Search Keys

      Query Length 3               Query Length 4
Trial Block     Mean          Trial Block     Mean
-----------------------------------------------------
     1         14.97 (A)           1         25.49 (A)
     2         14.41 (A)           2         22.81 (AB)
     3         14.15 (A)           3         19.43 (BC)
     4         13.12 (A)           4         19.40 (BC)
     5         13.11 (A)           5         19.01 (BC)
     6         10.82 (A)           6         18.18 (BC)

NOTE: Means for the same query length sharing a
common letter in parentheses are not significantly
different (p > 0.01).

TABLE 10

Friedman Analysis of Variance for Rank-Order
Dimensions

                      Rank Sums
Dimension     Cursor  Trackball  Search    Chi-Square          p
----------------------------------------------------------------
Preference      45       27        36          9.00      < 0.020
Speed           49       28        31         14.30      < 0.001
Accuracy        35       38        35          0.33      > 0.800
Comfort         41       30        37          3.44      > 0.050

Overall, query construction times were faster
with the cursor keys and the trackball than with
the search keys. An initial improvement in the
performance of longer queries with the search keys
was observed (from trial block 1 to trial block 2),
but query construction time showed no further
improvement with practice and never reached the
levels obtained with the other devices. Although
errors did not occur any more frequently with the
search keys than with the cursor keys or trackball,
the reliably slower query construction time with
the search keys would appear to rule them out as a
primary means of cursor control. If search keys
are permitted as an option, their use should
perhaps be limited to circumstances where there are
no time-dependent performance requirements. Under
more time-critical conditions, the use of search
keys is predicted to result in some performance
decrement.

In terms of overall performance times and error
frequencies, neither the cursor keys nor the
trackball displayed any relative disadvantage.
Initially, though, query construction was slower
using the cursor keys to build longer queries.
Performance with the cursor keys on the longer
queries did, however, quickly improve (from the
first to the second trial block). Interestingly,
there was no significant change in performance time
across trial blocks for the trackball with any
query length. It would thus appear that
performance with the trackball was asymptotic by
the first trial block, and, consequently, the
trackball had an early advantage over the cursor
keys. This might be relevant for any beginning,
intermittent, or infrequent system user, insofar as
he could more quickly "get up to speed" with the
trackball than he could with the cursor keys.
Significantly, even though the trackball was not
objectively faster than the cursor keys overall,
the subjects perceived the trackball to be a faster
device. Also, since the subjects preferred the
trackball over the cursor keys, use of the
trackball may facilitate user acceptance of the
system.


The finding that queries requiring scrolling took
longer to construct is consistent
with findings from studies on menu breadth/depth
tradeoffs. These studies have shown, in general,
that menu selection time increases with greater
search depth. In effect, scrolling for menu items
is an instance of searching at a deeper level in a
menu than is the case when searching for a menu
item that does not require scrolling. As query
length increases and the search for menu items
requires even more scrolling, the total search
"depth" increases and, as a consequence, query
construction time increases. Generally, then, the
menus should be limited in length, to the extent
possible given the size of the vocabulary required
by the domain.

The finding that longer queries take longer to
construct is, of course, predictable. However, as
noted, this effect will be exaggerated when the task
of locating the target items requires scrolling.
Consequently, there is a tradeoff between query
length and menu length.
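
The way scrolling magnifies the query-length effect can be made concrete with the cell means from Table 7. The Python sketch below computes the extra time associated with scrolling at each query length. It is simple arithmetic on the published means, not a reanalysis of the data.

```python
# Cell means in seconds from Table 7, keyed by query length.
SCROLLING = {2: 8.07, 3: 15.74, 4: 22.85}
NO_SCROLLING = {2: 4.78, 3: 9.48, 4: 15.15}

# Extra construction time associated with scrolling at each query length.
penalty = {ql: SCROLLING[ql] - NO_SCROLLING[ql] for ql in sorted(SCROLLING)}

for ql, extra in penalty.items():
    print(f"{ql}-item query: scrolling adds {extra:.2f} s")
```

The penalty grows from roughly 3.3 s for two-item queries to 7.7 s for four-item queries, which is the pattern behind the significant scrolling by query length interaction in Table 2.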

To decrease query lengths, semantically related
items from different menus might be merged together
where possible, yielding fewer menu items required
for building some queries. The drawback will be
that the length of at least one menu must increase,
and, consequently, more scrolling will be required
to build queries with that menu. On the other
hand, to decrease the length of a menu, those menu
items that can be separated into different
categories could be placed into separate menus.

Alternatively, one might consider the frequency
with which menu items can be included as a part of
longer queries. Menu items frequently included in
longer queries might then be placed within separate
shorter menus, and less scrolling would
consequently be required for building longer
queries. Thus, a tradeoff between query length and
menu length would be achieved, with a reduction in
the length of some queries, a reduction in the
length of some menus, and an overall decrease in
the amount of scrolling required to build longer
queries.


SUMMARY  AND  CONCLUSIONS 


The  significance  of  this  study  is  in  the  human 
factors  recommendations  for  the  design  of 
workstations  for  Navy  Operations  Specialists  who 
will  use  menu-based  natural  language  interfaces . 
For  these  workstations,  the  input  device  tradeoffs 
should  consider  the  relative  performance  of  the 
cursor  keys  and  the  trackball  to  be  the  same. 
However, the users' preference for the trackball
and the early performance benefits in learning the
menu-based natural language task give the
trackball an advantage.

The  screen  design  of  the  MBNL  is  a  difficult 
task  at  best.  While  design  issues  of  menu  size  and 
query  length  were  not  the  prime  focus  of  this 
research,  their  interaction  with  input  device 
performance  was  evaluated.  The  trackball 
demonstrated an initial advantage on the longer
queries. The search key performance improved over
time, reaching asymptote at the third session.
Research  to  provide  guidelines  for  quantitative 
tradeoffs  between  menu  length  and  query  length  is 
recommended . 

If a workstation design allows use of only one of
these input devices, the clear choice is the
trackball. If the task requires keyboard entry,
the combined input options of trackball and cursor
keys should be provided. Optionally, providing
search keys may be useful to certain experienced
individuals performing highly learned tasks.

Appendix A:


Menu  Items 


Count 

Display 

List 


blink
bright1
bright2
dot blinking
dot normal
inverse
size1
size2
symbol blinking
symbol bright1
symbol bright2
symbol normal

air
Soviet air
UK air
US air
EA2B
E2
F-14
S3A
platform
special point
ASW search center
downed aircraft
subsurface
surface
UK surface
neutral surface
Soviet surface
US surface
CV-64


I T. 


W4 


track 

hooked  track 
track  7526 
PU  number  24 
unknown  suspect 


ATTRIBUTE Window

controlling friend air whose location is quadrant 4
controlling jammer mission whose location is hook location
on barcap
received by link 11
reported by US Ticonderoga
reporting hostile air whose location is quadrant 3
whose designation is force FAAWC
whose designation is force ID
whose location is quadrant 1
whose location is quadrant 1 and controlled by CV-67
whose location is quadrant 1 and reported by FF-1023
whose mission is AEW
within 50 nautical miles
with remote data source
with Tomahawk missiles

Appendix B:


Queries 


Count  air 
Count  Soviet  air 
Count  UK  air 
Count  US  air 
Count  E2 
Count  EA2B 
Count F-14
Count S3A
Count platform
List air
List Soviet air
List UK air
List US air
List  E2 
List  EA2B 
List  F-14 
List  S3A 
List  platform 


Count  ASW  search  center 
Count  downed  aircraft 
Count  subsurface 
Count  surface 
Count  British  surface 
Count  neutral  surface 
Count  Soviet  surface 
Count  US  surface 
Count  track 
List  ASW  search  center 
List  downed  aircraft 
List  subsurface 
List  surface 
List  British  surface 
List  neutral  surface 
List  Soviet  surface 
List  US  surface 


List air controlling friend air whose location is quadrant 4
List air controlling jammer mission whose location is hook location
List US air reporting hostile air whose location is quadrant 3
List EA2B reported by US Ticonderoga
List platform controlling friend air whose location is quadrant 4

List platform on barcap
Count air on barcap
Count air reported by US Ticonderoga
Count US air controlling friend air whose location is quadrant 4
Count US air reported by US Ticonderoga
Count platform controlling jammer mission whose location is hook location
Count platform reporting hostile air whose location is quadrant 3
Display bright1 air
Display size2 UK air
Display dot blinking US air
Display inverse E2
Display dot normal F-14
Display blink platform

QL=3 SC=YES

Display symbol bright1 ASW search center
Display symbol bright2 subsurface
Display symbol normal neutral surface
Display bright1 US surface
Display symbol bright2 hooked track
Display symbol normal PU number 24

List special point whose location is quadrant 1 and reported by FF-1023
List ASW search center within 50 nautical miles
List subsurface with remote data source
List surface with Tomahawk missiles
List US surface whose designation is force ID
List neutral surface whose location is quadrant 1
Count special point whose location is quadrant 1
Count downed aircraft within 50 nautical miles
Count UK surface whose designation is force ID
Count neutral surface whose designation is force FAAWC
Count Soviet surface whose location is quadrant 1 and reported by FF-1023
Count track whose mission is AEW

Display bright1 air controlling friend air whose location is quadrant 4
Display size1 air on barcap
Display dot blinking air reported by US Ticonderoga
Display dot normal UK air controlling friend air whose location is quadrant 4
Display dot blinking air controlling jammer mission whose location is hook location
Display size2 US air controlling jammer mission whose location is hook location

Display blink US air on barcap
Display bright2 US air reporting hostile air whose location is quadrant 3
Display bright1 E2 reported by US Ticonderoga
Display symbol blinking EA2B reported by US Ticonderoga
Display inverse EA2B reporting hostile air whose location is quadrant 3
Display size1 F-14 on barcap
Display dot normal F-14 reporting hostile air whose location is quadrant 3
Display bright2 S3A reporting hostile air whose location is quadrant 3
Display size2 platform controlling friend air whose location is quadrant 4
Display inverse platform controlling jammer mission whose location is hook location
Display blink platform on barcap

QL=4 SC=YES

Display symbol bright2 special point whose location is quadrant 1 and controlled by CV-67
Display symbol normal special point with remote data source
Display symbol bright1 ASW search center within 50 nautical miles
Display symbol bright2 downed aircraft whose mission is AEW
Display symbol normal subsurface whose designation is force ID
Display bright1 subsurface with remote data source
Display symbol normal surface whose location is quadrant 1
Display symbol bright2 surface within 50 nautical miles
Display symbol bright1 UK surface whose designation is force [illegible]
Display symbol normal UK surface whose designation is force FAAWC
Display symbol bright1 US surface whose location is quadrant 1
Display symbol bright2 US surface with Tomahawk missiles
Display bright2 neutral surface whose designation is force ID
Display symbol normal neutral surface with Tomahawk missiles


Display symbol bright2 Soviet surface whose location is quadrant 1 and reported by FF-1023
Display symbol bright1 track whose location is quadrant 1 and controlled by CV-67
Display symbol bright1 track whose mission is AEW
Display symbol normal track within 50 nautical miles


Appendix  C:  Rating  Form  for  Devices 

SCALES  FOR  DEVICE 

Please rate the device you have just used on the
following descriptive scales:

ACCURATE      :---:---:---:---:---:---:---:  INACCURATE

FAST          :---:---:---:---:---:---:---:  SLOW

INCONSISTENT  :---:---:---:---:---:---:---:  CONSISTENT

COMFORTABLE   :---:---:---:---:---:---:---:  UNCOMFORTABLE

UNACCEPTABLE  :---:---:---:---:---:---:---:  ACCEPTABLE


Appendix  D : 


Ranking  Form  for  Devices 


RANKINGS OF INPUT DEVICES

Please rank the input devices you have used based on the
following criteria. Simply place the appropriate letter in
the desired space.

C = cursor keys
T = trackball
Z = zoom/cursor key combination

Most Preferred _
Least Preferred _

Fastest Selection Speed
Slowest Selection Speed

Highest Accuracy
Lowest Accuracy

Most Comfortable
Least Comfortable